Statistical Inference After Model Selection∗
Authors: Richard Berk, Lawrence Brown, Linda Zhao
Abstract
Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in criminology, and in the social sciences more broadly, a variety of model selection procedures are routinely undertaken, followed by statistical tests and confidence intervals computed for a “final” model. In this paper, we examine such practices and show how they are typically misguided. The parameters being estimated are no longer well defined, and post-model-selection sampling distributions are mixtures ...

∗ Richard Berk’s work on this paper was funded by a grant from the National Science Foundation: SES-0437169, “Ensemble methods for Data Analysis in the Behavioral, Social and Economic Sciences.” The work by Lawrence Brown and Linda Zhao was supported in part by NSF grant DMS-07-07033. Thanks also go to Andreas Buja, Sam Preston, Jasjeet Sekhon, Herb Smith, Phillip Stark, and three reviewers for helpful suggestions about the material discussed in this paper.
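The abstract's claim that post-model-selection sampling distributions are mixtures can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and is not taken from the paper: two correlated predictors, no intercept, a selection rule that drops the second predictor when its t-statistic is below 1.96 in absolute value, and the particular parameter values. The reported estimate of the first coefficient then comes from the full model in some replications and from the reduced model in others, so its sampling distribution is a mixture, and the naive 95% interval computed from the "final" model covers the true value less often than its nominal level suggests.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients and their standard errors (no intercept)."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se


def simulate(reps=2000, n=200, b1=0.0, b2=0.15, rho=0.7, z=1.96, seed=0):
    """Compare naive interval coverage for b1 with and without selection.

    All settings are hypothetical illustration values, not from the paper.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    cover_full = 0      # full model always used (model fixed in advance)
    cover_selected = 0  # model chosen by the data, then naive interval
    kept_x2 = 0         # replications in which the full model was selected

    for _ in range(reps):
        X = rng.multivariate_normal(np.zeros(2), cov, size=n)
        y = b1 * X[:, 0] + b2 * X[:, 1] + rng.standard_normal(n)

        beta_f, se_f = ols(X, y)
        cover_full += int(abs(beta_f[0] - b1) <= z * se_f[0])

        # Selection step: keep x2 only if it looks "significant".
        if abs(beta_f[1] / se_f[1]) > z:
            est, se = beta_f[0], se_f[0]
            kept_x2 += 1
        else:
            beta_r, se_r = ols(X[:, :1], y)
            est, se = beta_r[0], se_r[0]

        cover_selected += int(abs(est - b1) <= z * se)

    return {
        "coverage_full": cover_full / reps,
        "coverage_selected": cover_selected / reps,
        "share_full_model": kept_x2 / reps,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, the interval from the pre-specified full model should cover the true coefficient at close to the nominal rate, while the interval computed after selection should fall clearly short of it. The share of replications that select the full model shows that both models contribute to the mixture.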
Similar resources
Statistical Inference in Autoregressive Models with Non-negative Residuals
Normally distributed residuals are one of the usual assumptions of autoregressive models, but in practice we sometimes face the case of non-negative residuals. In this paper we consider some autoregressive models with non-negative residuals as competing models, and we derive the maximum likelihood estimators of the parameters based on the modified approach and the EM algorithm for the competing models. Also,...
Bayesian Methods to Impute Missing Covariates for Causal Inference and Model Selection
By Robin Mitra, Department of Statistical Science, Duke University
Model Selection in Adaptive Neuro Fuzzy Inference System (ANFIS) by using Inference of R Incremental for Time Series Forecasting
The aim of this paper is to propose a procedure for model selection in the Adaptive Neuro-Fuzzy Inference System (ANFIS) for time series forecasting. We focus on model selection based on statistical inference of the R² increment. Model selection is conducted by evaluating the inputs, the number of membership functions, and the rules in the ANFIS architecture until the contribution of R² i...
POST-SELECTION INFERENCE By Richard Berk
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to ...
POST-SELECTION INFERENCE By Richard
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to ...
Post-Selection Inference
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to ...